Abstract

Gallager codes are among the best error-correcting codes known to date. In this paper we study them
by using the tools of statistical mechanics. The corresponding statistical mechanics model is
a spin model on a sparse random graph. The model can be solved by elementary methods
(i.e. without replicas) in a large connectivity limit. For low enough temperatures it presents
a completely frozen glassy phase ($q_{\rm EA} = 1$). The same scenario is shown to hold for finite
connectivities. In this case we adopt the replica approach and exhibit a one-step replica
symmetry breaking order parameter. We argue that our ansatz yields the exact solution of
the model. This allows us to determine the whole phase diagram and to understand the
performance of Gallager codes.
1 Introduction



Information theory [1, 2] deals with the problem of reliable communication through an imperfect
(noisy) communication channel. This can be done by properly encoding the information
message in such a way as to increase its redundancy. If a transmission error occurs due to the
noise, the correct message can be restored by exploiting this redundancy.

The price to pay for error correction is an increase in the length of the transmitted
message, i.e. a decrease in the information rate through the channel. In 1948 C. E. Shannon [?]
computed the maximal achievable rate at which information can be transmitted through
a given communication channel (the so-called capacity of the channel). Since then a lot of work
has been devoted to constructing practical error-correcting codes that could realize Shannon's
prediction, i.e. that could saturate the channel capacity.

In the past few years it has become progressively clear that such an objective is not
unreachable. It has become possible to construct error-correcting codes which remain effective
extremely near to the Shannon capacity [?]. The reasons for this revolution have been the
invention of "turbo codes" [?] and the re-invention of "low-density parity check codes"
(LDPCC) [?]. The latter were first proposed by R. Gallager in 1962, but were
soon forgotten, probably because of the lack of computational resources at that
time.

As has been shown by N. Sourlas [8-10], error-correcting codes can be mapped onto
disordered spin models. This mapping allows one to employ statistical mechanics techniques to
investigate the behavior of the former. Both turbo codes [11, 12] and LDPCC [13-19] have
already been studied using this approach. However, all previous studies were restricted to
particular regions of the phase diagram. The principal technical reason was the difficulty of
implementing replica symmetry breaking in finite connectivity systems.

In this work we focus on regular Gallager codes (a particular family of LDPCC), and we 
address the fundamental problem of determining the corresponding phase diagram. There are 
two types of motivation for such a task to be undertaken. First, the spin model corresponding
to Gallager codes is a disordered spin model on a diluted graph. The study of such systems 
has greatly improved our understanding of glassy systems over the last few years. Second, 
it is of great practical importance to have a complete quantitative picture of the behavior of 
Gallager codes. For instance, the existence of a glassy phase can have important effects on the 
decoding algorithms, and the knowledge of the phase diagram can be used to improve them. 

The model is presented in Sec. 2. In Sec. 3 we prove some exact properties which hold at
inverse temperature $\beta = 1$. The line $\beta = 1$ can be regarded as the Nishimori line [20] of the
phase diagram. In Sec. 4 we solve the model in the large connectivity limit. We show that
it becomes identical to a simplified model which we call the random codeword model (RCM).
The RCM is shown to have a freezing phase transition analogous to the one of the random
energy model (REM) [21]. In Sec. 5 we adopt the replica approach [22] and prove that the
same scenario applies for finite connectivities. In particular we construct a replica symmetry
breaking solution of the saddle point equations. The proposed solution is much simpler than
the generic one-step replica symmetry breaking solution. Rather than being parametrized by a
functional over a probability space [23], it depends simply upon the probability distribution of
a local field. Such a probability distribution can be easily computed numerically. It can also be
obtained from a large connectivity expansion, see Sec. 6. In Sec. 7 we compute the finite-size
corrections of the free energy for the RCM, and compare the result with exact enumerations.
Finally in Sec. 8 we discuss the validity of our replica symmetry breaking ansatz.
2 The model



Let us suppose we want to transmit an information message consisting of $L$ bits. There are
$2^L$ such messages. Each of them is encoded in a string of $N > L$ bits (a codeword).

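This motivates the following model. There are $2^L$ possible configurations of the system (the
codewords), each one corresponding to a distinct sequence of $N > L$ bits. We shall denote the
codewords as $\underline{x}^{(\alpha)} = (x_1^{(\alpha)}, \dots, x_N^{(\alpha)})$, with $\alpha = 1, \dots, 2^L$. The set of codewords $\mathcal{C}$ is a linear
space. This means that $\underline{0} = (0, \dots, 0) \in \mathcal{C}$, and that, if $\underline{x}^{(\alpha)}, \underline{x}^{(\beta)} \in \mathcal{C}$, then $\underline{x}^{(\alpha)} + \underline{x}^{(\beta)} \in \mathcal{C}$
(where the sum has to be carried out modulo 2).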

Like any linear space, the set of codewords $\mathcal{C}$ can be specified as the kernel of a linear
operator. In other words, we can find an $M$ by $N$ matrix $C = \{C_{ij}\}_{i=1,\dots,M;\, j=1,\dots,N}$, with
$C_{ij} \in \{0, 1\}$ and $M = N - L$, such that

$$\mathcal{C} = \{\underline{x}^{(\alpha)} : \alpha = 1, \dots, 2^L\} = \{\underline{x} \in \{0,1\}^N : C\underline{x} = \underline{0} \pmod 2\}. \tag{2.1}$$

The condition $C\underline{x} = \underline{0} \pmod 2$ can be regarded as a set of $M$ linear equations (called constraints
or parity checks) of the form:

$$C_{i1} x_1 + C_{i2} x_2 + \dots + C_{iN} x_N = 0 \pmod 2, \tag{2.2}$$

with $i = 1, \dots, M$.
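The kernel description of Eqs. (2.1) and (2.2) can be checked by brute force on a toy code. The following is a minimal sketch, not taken from the paper: the check matrix `C_EXAMPLE` is a hypothetical example with $M = 3$ independent checks on $N = 6$ bits, and the function name `codewords` is an illustrative choice.

```python
from itertools import product

def codewords(C):
    """Return all bit strings x (as tuples) with C x = 0 (mod 2)."""
    n = len(C[0])
    return [x for x in product((0, 1), repeat=n)
            if all(sum(c * b for c, b in zip(row, x)) % 2 == 0 for row in C)]

# Hypothetical example (not from the paper): M = 3 checks on N = 6 bits.
C_EXAMPLE = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]

words = codewords(C_EXAMPLE)
word_set = set(words)
# Linearity: the set of codewords is closed under bitwise sum modulo 2.
closed = all(tuple((a + b) % 2 for a, b in zip(x, y)) in word_set
             for x in words for y in words)
print(len(words), closed)
```

Since the three rows are linearly independent, the enumeration finds $2^{N-M}$ codewords, in agreement with $M = N - L$.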

To each bit $x_i$, $i = 1, \dots, N$, we assign an a priori probability distribution $p_i(x_i)$. In the
information-theory context, the a priori distributions $p_i(x_i)$ are induced by the observation of
the channel output, and by the knowledge of the statistical properties of the channel. We are
interested in studying the induced probability distribution over the codewords $\underline{x}^{(\alpha)}$. In other
words we want to consider the following probability distribution over the strings $\underline{x}$ of $N$ bits:

$$P(\underline{x}) = \frac{1}{Z}\, \delta[C\underline{x}] \prod_{i=1}^{N} p_i(x_i), \tag{2.3}$$

where $Z$ is a normalization constant; $\delta[\underline{z}] = 1$ if $\underline{z} = \underline{0} \pmod 2$, and $\delta[\underline{z}] = 0$ otherwise.

There are several graphical representations of the above model. The one most used in the
coding theory community makes use of the so-called Tanner graph [24], cf. Fig. 1. This is a
bipartite graph which is constructed as follows. A node on the left is associated to each binary
variable $x_j$, and a node on the right to each constraint, i.e. to each linear equation (2.2) with
$i = 1, \dots, M$. There are therefore $N$ left nodes (variable nodes), and $M$ right nodes (check
nodes). A given check $i$ is connected to the variables $x_j$ which appear with nonzero coefficient
in the corresponding equation (2.2).



The model (2.3) has a spin-wise formulation [13-19] which we shall employ hereafter. We
replace any bit sequence $\underline{x} = (x_1, \dots, x_N)$ with a spin configuration $\underline{\sigma} = (\sigma_1, \dots, \sigma_N)$, where
$\sigma_i = (-1)^{x_i}$. The constraints (2.2) on the sums of bits $x_i$ get translated into constraints on
the product of spins $\sigma_i$. These have the form

$$\sigma_{\omega_i} \equiv \prod_{j \in \omega_i} \sigma_j = +1, \tag{2.4}$$

where $\omega_i = \{j \in \{1, \dots, N\} : C_{ij} = 1\}$. The other ingredient of the model is the set of a priori
probability distributions $p_i(x_i)$. They can be encoded into properly chosen magnetic fields:
$p_i(x_i) = e^{\beta h_i \sigma_i}/(2\cosh\beta h_i)$, with $2\beta h_i = \log(p_i(0)/p_i(1))$, where we introduced the inverse
temperature $\beta$ for later convenience. With these building blocks, we can write down the spin
model equivalent of Eq. (2.3):

$$P(\underline{\sigma}) = \frac{1}{Z} \prod_{i=1}^{M} \delta[\sigma_{\omega_i}, +1]\, \exp\left\{\beta \sum_{i=1}^{N} h_i \sigma_i\right\}, \tag{2.5}$$

where $\delta[a, b]$ is the Kronecker delta function. This can be regarded as a spin model with infinite
strength multi-spin interactions (which enforce $\sigma_{\omega_i} = +1$) and a random magnetic field.

Figure 1: Two Tanner graphs: a regular one with $(k, l) = (6, 3)$ on the left, and an irregular one on
the right. In both cases $N = 8$, $M = 4$ (and therefore the rate is $R = 1/2$).
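The equivalence between the parity checks on bits and the constraints on products of spins can be verified exhaustively on a small example. This is an illustrative sketch only: the $2 \times 4$ matrix `C_EXAMPLE` and all function names are hypothetical, and `unnormalized_weight` computes just the numerator of the spin distribution (the constraint indicator times the field term), not the normalized probability.

```python
from itertools import product
from math import exp, prod

def bits_to_spins(x):
    """Map bits to spins via sigma_i = (-1)**x_i."""
    return tuple(1 - 2 * b for b in x)

def bit_checks_ok(x, C):
    """True if C x = 0 (mod 2), i.e. all parity checks hold."""
    return all(sum(c * b for c, b in zip(row, x)) % 2 == 0 for row in C)

def spin_checks_ok(sigma, C):
    """True if the product of the spins entering every check equals +1."""
    return all(prod(s for s, c in zip(sigma, row) if c) == 1 for row in C)

def unnormalized_weight(sigma, C, fields, beta):
    """Zero off the code, exp(beta * sum_i h_i sigma_i) on it."""
    if not spin_checks_ok(sigma, C):
        return 0.0
    return exp(beta * sum(h * s for h, s in zip(fields, sigma)))

# Hypothetical 2 x 4 check matrix, used only for illustration.
C_EXAMPLE = [[1, 1, 0, 1], [0, 1, 1, 1]]
agree = all(bit_checks_ok(x, C_EXAMPLE) == spin_checks_ok(bits_to_spins(x), C_EXAMPLE)
            for x in product((0, 1), repeat=4))
print(agree)
```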

Instead of insisting on the motivations for the probabilistic model (2.5) coming from coding
theory, we shall remark that, as it stands, it is remarkably general. Any spin-model Hamiltonian
$H(\underline{\sigma}) = -\sum_{(i_1 \dots i_p)} J_{i_1 \dots i_p}\, \sigma_{i_1} \cdots \sigma_{i_p}$ can be written in the form (2.5). This can be done by
introducing the auxiliary spin variables $\sigma_{i_1 \dots i_p}$. The Kronecker delta functions in Eq. (2.5) can
be used to enforce $\sigma_{i_1 \dots i_p} = \sigma_{i_1} \cdots \sigma_{i_p}$. The couplings $J_{i_1 \dots i_p}$ become magnetic fields acting on
the variables $\sigma_{i_1 \dots i_p}$.

Until now we have been pretty generic in the presentation of the model. In order to be
more precise, we have to choose the constraint matrix $C$, and the magnetic fields $\{h_i\}_{i=1,\dots,N}$.

Following Gallager [7], we shall take $C$ to be random and sparse. More precisely $C$ will be
constrained to have $k$ non-zero elements for each row and $l$ non-zero elements for each column
(with $l < k$), and not to have two identical rows [footnote 1]. This choice corresponds to taking the Tanner
graph (cf. Fig. 1) as a random bipartite graph, with variable (left) nodes of fixed degree $l$,
and check (right) nodes of degree $k$. We shall choose among the matrices of this ensemble
with flat probability distribution. We shall use the pair $(k, l)$ to denote the spin model (or the
error-correcting code) defined by this ensemble of matrices. An important characteristic of the
code is its rate $R = 1 - l/k$, which measures the redundancy of the encoded message (in fact
$R = L/N$).
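To make the row and column constraints concrete, here is a minimal sketch that builds one sparse matrix with exactly $k$ ones per row and $l$ ones per column. It uses a stacked-permutation construction (a band matrix followed by column-permuted copies), which is an assumption of this example: it does not sample the ensemble with flat probability, and it does not exclude identical rows. The function name `gallager_matrix` and its parameters are illustrative.

```python
import random

def gallager_matrix(n, k, l, rng):
    """Build an (n*l/k) x n binary matrix with row weight k and column weight l.

    The matrix consists of l stacked blocks. The first block is a band matrix;
    each further block is a random column permutation of it, so every block
    contributes exactly one non-zero entry to each column.
    """
    if n % k != 0:
        raise ValueError("n must be a multiple of k")
    rows_per_block = n // k
    matrix = []
    for block in range(l):
        perm = list(range(n))
        if block > 0:
            rng.shuffle(perm)
        for r in range(rows_per_block):
            row = [0] * n
            for j in range(r * k, (r + 1) * k):
                row[perm[j]] = 1
            matrix.append(row)
    return matrix

H = gallager_matrix(12, 6, 3, random.Random(1))
rate = 1 - len(H) / 12  # equals 1 - l/k when all checks are independent
print(len(H), rate)
```

With $N = 12$ and $(k, l) = (6, 3)$ the matrix has $M = Nl/k = 6$ rows, so that $1 - M/N = 1 - l/k = 1/2$.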

[Footnote 1] Remark that, with this choice, some of the parity check equations (2.2) may be linearly dependent. However,
such an event is rare for $k > l$.

The magnetic fields $h_i$ will be random i.i.d. variables with probability distribution $p_h(h_i)$.
We consider $p_h(h_i)$ to be biased towards positive values of $h_i$ (i.e. $\int dh_i\, p_h(h_i)\, h_i > 0$). We
shall refer often to two simple examples: the two-peak distribution

$$p_h(h_i) = (1 - p)\, \delta(h_i - h_0) + p\, \delta(h_i + h_0), \tag{2.6}$$

with $p < 1/2$ and $h_0 > 0$, and the gaussian distribution



$$p_h(h_i) = \frac{1}{\sqrt{2\pi \hat{h}^2}} \exp\left\{-\frac{(h_i - h_0)^2}{2\hat{h}^2}\right\}, \tag{2.7}$$

with $h_0 > 0$. It can be shown that, if the model describes communication through a noisy
"symmetric" channel, the condition

$$p_h(-h_i) = e^{-2h_i}\, p_h(h_i) \tag{2.8}$$

follows. This implies $h_0 = (1/2)\log[(1 - p)/p]$ for the example (2.6) (which corresponds to a
binary symmetric channel), and $h_0 = \hat{h}^2$ for the example (2.7) (corresponding to a gaussian
channel). Hereafter we shall denote with $\langle\cdot\rangle_h$ and $\langle\cdot\rangle_C$ the averages with respect to the magnetic
fields $\{h_i\}$, and the ensemble of matrices $C$.
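The two consequences of the symmetry condition (2.8) quoted above can be checked numerically. The sketch below is illustrative (function names are not from the paper): it verifies that $h_0 = (1/2)\log[(1-p)/p]$ coincides with ${\rm arctanh}(1 - 2p)$ for the two-peak case, and that a gaussian density whose variance equals its mean satisfies $p_h(-h) = e^{-2h} p_h(h)$.

```python
import math

def two_peak_h0(p):
    """Field strength fixed by the symmetry condition for the two-peak case."""
    return 0.5 * math.log((1 - p) / p)

def gaussian_density(h, mean, var):
    """Gaussian probability density with the given mean and variance."""
    return math.exp(-(h - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def symmetry_defect(h, var):
    """p_h(-h) - exp(-2h) p_h(h) for a gaussian whose mean equals its variance."""
    return (gaussian_density(-h, var, var)
            - math.exp(-2 * h) * gaussian_density(h, var, var))

p = 0.1
h0 = two_peak_h0(p)
print(h0, math.atanh(1 - 2 * p))
print([symmetry_defect(h, 0.7) for h in (0.0, 0.3, 1.5, 4.0)])
```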

More details on the model introduced in this Section, and on analogous examples, can be
found in Refs. [?].



3 The Nishimori line 

Nishimori [20, 25] showed that the physics of disordered spin models simplifies considerably
on a particular line in the phase diagram. In particular, it has been recently shown [26] that
replica symmetry breaking is absent on this line. The Nishimori line plays a distinguished
role in the correspondence between error-correcting codes and disordered spin models. As
shown in Refs. [27, 28], maximum a posteriori symbol probability (MAP) decoding for a given
error-correcting code is equivalent to computing expectation values on the Nishimori line of
the corresponding spin model.

In this Section we extend the results concerning the Nishimori line to the model (2.5). We
shall consider a generic magnetic field distribution $p_h(h_i)$ satisfying Eq. (2.8). In this case
the Nishimori line is simply given by $\beta = 1$. Although the proofs are very similar to the ones
of Refs. [25, 26], we present them for the sake of completeness. Some consequences of the exact
results of this Section will be outlined in Sec. 5.

Let us start with some conventions. Notice that there are two sources of disorder in our
model (2.3): the magnetic field $h_i$ (which is determined by the channel output), and the check
matrix $C$. Different $C$ correspond to different error-correcting codes. In this Section we keep
the parity check matrix $C$ fixed, and average only over the random magnetic fields $\{h_i\}$,
with distribution $p_h(h_i)$. Our results will remain valid after averaging with respect to any
ensemble of check matrices $C$ (i.e. to any ensemble of codes). It is convenient to introduce the
notation $\delta_C[\underline{\sigma}]$ to denote the product of Kronecker delta functions in Eq. (2.5). In other words
$\delta_C[\underline{\sigma}] = 1$ if and only if $\underline{\sigma}$ satisfies all the parity checks encoded in $C$, i.e. if the corresponding
string of bits $\underline{x}$ is a codeword. We assume that the parity check matrix $C$ selects $2^L = 2^{NR}$
codewords. This means that there are $2^L$ distinct configurations $\underline{\sigma}$ such that $\delta_C[\underline{\sigma}] = 1$. Finally
we shall take the distribution of the random fields to satisfy the identity (2.8).

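We start by writing down the definition of the (field averaged) free energy density $f_C(\beta)$
for a given parity check matrix $C$:

$$-\beta N f_C(\beta) = \int_{-\infty}^{+\infty} \prod_{i=1}^{N} dh_i\, p_h(h_i)\, \log\left\{\sum_{\underline{\sigma}} \delta_C[\underline{\sigma}]\, e^{\beta \sum_i h_i \sigma_i}\right\}. \tag{3.1}$$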
Then we notice, following Ref. [20], that the integral over the field $h_i$ can be decomposed into
an integral over its absolute value and a sum over its sign. Using Eq. (2.8), we get, for any
function $O(h_i)$

$$\int_{-\infty}^{+\infty} dh_i\, p_h(h_i)\, O(h_i) = \int_{0}^{+\infty} dh_i\, \rho(h_i) \sum_{\tau_i = \pm 1} e^{h_i \tau_i}\, O(h_i \tau_i), \tag{3.2}$$

where $\rho(h_i)$ is given by

$$\rho(h_i) = \frac{p_h(h_i) + p_h(-h_i)}{2\cosh h_i}. \tag{3.3}$$



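By using the decomposition (3.2) in the definition (3.1), we get

$$-\beta N f_C(\beta) = \int_{0}^{+\infty} \prod_{i=1}^{N} dh_i\, \rho(h_i) \sum_{\underline{\tau}} e^{\sum_i h_i \tau_i}\, \log\left\{\sum_{\underline{\sigma}} \delta_C[\underline{\sigma}]\, e^{\beta \sum_i h_i \tau_i \sigma_i}\right\}. \tag{3.4}$$

To be more compact, we shall use hereafter the shorthand $\langle\cdot\rangle_\rho = \int_0^{+\infty} \prod_{i=1}^{N} dh_i\, \rho(h_i)\, (\cdot)$ for the
average over the absolute values of the fields $\{h_i\}$.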

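The next step consists in performing a gauge transformation $\tau_i \to \sigma'_i \tau_i$, $\sigma_i \to \sigma'_i \sigma_i$. Because
of the constraint term $\delta_C[\underline{\sigma}]$, the free energy (3.4) is not invariant with respect to such a
transformation for a generic choice of $\{\sigma'_i\}$. However, if $\delta_C[\underline{\sigma}'] = 1$, i.e. if $\underline{\sigma}'$ is a codeword,
then the gauge transformation leaves the free energy invariant. We can sum over all such
"allowed" transformations, and divide by their number, namely $2^{NR}$, obtaining

$$-\beta N f_C(\beta) = \frac{1}{2^{NR}} \left\langle \sum_{\underline{\tau}} \sum_{\underline{\sigma}'} \delta_C[\underline{\sigma}']\, e^{\sum_i h_i \tau_i \sigma'_i}\, \log\left\{\sum_{\underline{\sigma}} \delta_C[\underline{\sigma}]\, e^{\beta \sum_i h_i \tau_i \sigma_i}\right\} \right\rangle_\rho, \tag{3.5}$$

where the constraint $\delta_C[\underline{\sigma}']$ forces the gauge transformation $\underline{\sigma}'$ to be an allowed one.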

In Eq. (3.5) we wrote the sums over quenched and dynamical variables in a symmetric
form. This allows one to derive several exact identities for $\beta = 1$, where the symmetry is complete.
In particular, let us consider the internal energy per spin $e_C(\beta) = \partial_\beta(\beta f_C(\beta))$. From Eq. (3.5)
we get

$$e_C(\beta = 1) = -\frac{1}{2^{NR} N} \left\langle \sum_{\underline{\tau}} \sum_{\underline{\sigma}} \delta_C[\underline{\sigma}] \left(\sum_{i=1}^{N} h_i \tau_i \sigma_i\right) e^{\sum_i h_i \tau_i \sigma_i} \right\rangle_\rho. \tag{3.6}$$

We can now perform a second gauge transformation $\tau_i \to \tau_i \sigma_i$, sum over the $\{\sigma_i\}$ using the
constraint, and finally sum over the $\tau_i$. We obtain $e_C(\beta = 1) = -\langle h \tanh h \rangle_h$. Analogously to
Ref. [25], we can further simplify this result by using Eq. (2.8), obtaining

$$e_C(\beta = 1) = -\langle h \rangle_h, \tag{3.7}$$

which is the first important result of this Section.
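Because the identity for the internal energy at $\beta = 1$ holds for any fixed linear code, it can be tested by exact enumeration on a small example. The sketch below assumes the two-peak field distribution with $h_0 = (1/2)\log[(1-p)/p]$, for which $\langle h \rangle_h = (1 - 2p) h_0$; the check matrix `C_EXAMPLE` and the function names are hypothetical choices made for illustration.

```python
import math
from itertools import product

def codeword_spins(C):
    """All spin configurations sigma = (-1)**x with C x = 0 (mod 2)."""
    n = len(C[0])
    return [tuple(1 - 2 * b for b in x) for x in product((0, 1), repeat=n)
            if all(sum(c * b for c, b in zip(row, x)) % 2 == 0 for row in C)]

def average_energy_density(C, p, beta=1.0):
    """Field-averaged thermal energy per spin, by exact enumeration.

    Fields are +h0 with probability 1-p and -h0 with probability p,
    with h0 = (1/2) log((1-p)/p).
    """
    n = len(C[0])
    h0 = 0.5 * math.log((1 - p) / p)
    words = codeword_spins(C)
    total = 0.0
    for signs in product((1, -1), repeat=n):
        weight = math.prod((1 - p) if s == 1 else p for s in signs)
        fields = [h0 * s for s in signs]
        energies = [-sum(h * s for h, s in zip(fields, w)) for w in words]
        boltzmann = [math.exp(-beta * e) for e in energies]
        z = sum(boltzmann)
        total += weight * sum(e * b for e, b in zip(energies, boltzmann)) / z
    return total / n

# Hypothetical check matrix with N = 6, M = 3 (not from the paper).
C_EXAMPLE = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]
p = 0.1
h0 = 0.5 * math.log((1 - p) / p)
print(average_energy_density(C_EXAMPLE, p), -(1 - 2 * p) * h0)
```

Away from $\beta = 1$ the two numbers are not expected to coincide, which can be seen by passing a different `beta`.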

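We now want to prove the absence of replica symmetry breaking on the Nishimori line of
our model (2.3), i.e. for $\beta = 1$. As in Ref. [26], we consider the magnetization distribution

$$P^{(1)}(m) = \int \prod_{i=1}^{N} dh_i\, p_h(h_i)\, \frac{\sum_{\underline{\sigma}} \delta_C[\underline{\sigma}]\, e^{\beta \sum_i h_i \sigma_i}\, \delta(m - N^{-1}\sum_i \sigma_i)}{\sum_{\underline{\sigma}} \delta_C[\underline{\sigma}]\, e^{\beta \sum_i h_i \sigma_i}}, \tag{3.8}$$

and the overlap distribution

$$P^{(2)}(q) = \int \prod_{i=1}^{N} dh_i\, p_h(h_i)\, \frac{\sum_{\underline{\sigma}, \underline{\sigma}'} \delta_C[\underline{\sigma}]\, \delta_C[\underline{\sigma}']\, e^{\beta \sum_i h_i \sigma_i + \beta \sum_i h_i \sigma'_i}\, \delta(q - N^{-1}\sum_i \sigma_i \sigma'_i)}{\left[\sum_{\underline{\sigma}} \delta_C[\underline{\sigma}]\, e^{\beta \sum_i h_i \sigma_i}\right]^2}. \tag{3.9}$$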

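As before, we keep the parity check matrix $C$ fixed. We shall prove that the two probability
distributions defined above are indeed identical on the Nishimori line $\beta = 1$, i.e. $P^{(1)}(x) = P^{(2)}(x)$.
Since the probability distribution of the magnetization is expected to be a single delta
function [footnote 2] [22], this implies the absence of replica symmetry breaking for $\beta = 1$.

We begin by using the decomposition (3.2) in Eq. (3.8). This yields:

$$P^{(1)}(m) = \left\langle \sum_{\underline{\tau}} e^{\sum_i h_i \tau_i}\, \frac{\sum_{\underline{\sigma}} \delta_C[\underline{\sigma}]\, e^{\beta \sum_i h_i \tau_i \sigma_i}\, \delta(m - N^{-1}\sum_i \sigma_i)}{\sum_{\underline{\sigma}} \delta_C[\underline{\sigma}]\, e^{\beta \sum_i h_i \tau_i \sigma_i}} \right\rangle_\rho. \tag{3.10}$$

Then we notice that the above distribution is invariant under an "allowed" gauge
transformation $\tau_i \to \sigma'_i \tau_i$, $\sigma_i \to \sigma'_i \sigma_i$. As before, "allowed" means that $\delta_C[\underline{\sigma}'] = 1$. We can therefore
average over these transformations, obtaining

$$P^{(1)}(m) = \frac{1}{2^{NR}} \left\langle \sum_{\underline{\tau}} \sum_{\underline{\sigma}'} \delta_C[\underline{\sigma}']\, e^{\sum_i h_i \tau_i \sigma'_i}\, \frac{\sum_{\underline{\sigma}} \delta_C[\underline{\sigma}]\, e^{\beta \sum_i h_i \tau_i \sigma_i}\, \delta(m - N^{-1}\sum_i \sigma_i \sigma'_i)}{\sum_{\underline{\sigma}} \delta_C[\underline{\sigma}]\, e^{\beta \sum_i h_i \tau_i \sigma_i}} \right\rangle_\rho. \tag{3.11}$$

We then insert $1 = \left(\sum_{\underline{\sigma}''} \delta_C[\underline{\sigma}'']\, e^{\beta \sum_i h_i \tau_i \sigma''_i}\right) / \left(\sum_{\underline{\sigma}} \delta_C[\underline{\sigma}]\, e^{\beta \sum_i h_i \tau_i \sigma_i}\right)$, perform a second gauge
transformation $\tau_i \to \sigma''_i \tau_i$, $\sigma_i \to \sigma''_i \sigma_i$, $\sigma'_i \to \sigma''_i \sigma'_i$, and sum over $\underline{\sigma}''$. Finally we set $\beta = 1$, obtaining
$P^{(1)}(m) = P^{(2)}(m)$, as anticipated above.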



4 The random codeword limit 

The limiting case $k, l \to \infty$, with $l/k = 1 - R$ fixed, plays an important role. We shall call it
the random codeword limit for reasons which will be clear later. It is a non-trivial limit since
the redundancy of the error-correcting code is kept fixed. From a theoretical point of view, it
allows a simple solution of the model without changing its qualitative features. Our methods
will be similar to the ones used by Derrida to solve the REM [21]. Finally, we will show that
the corrections for finite values of $k$ and $l$ are exponentially small in $k$. Therefore this limit is
interesting also from a quantitative point of view.



4.1 The limit $k, l \to \infty$

Let us consider the probability for a given sequence of bits $\underline{x} = (x_1, \dots, x_N)$ to be a codeword
with respect to the ensemble of parity check matrices $C$. This coincides with the probability
$P_{\underline{\sigma}}$ for a given spin configuration $\underline{\sigma}$ to satisfy the constraints (2.4). In other words:

$$P_{\underline{\sigma}} = \frac{1}{\mathcal{N}_C} \sum_{C} \prod_{j=1}^{M} \delta[\sigma_{\omega_j}, +1], \tag{4.1}$$

where the sum over $C$ runs over all the matrices of the $(k, l)$-ensemble, and $\mathcal{N}_C$ is their number.

Clearly $P_{\underline{\sigma}}$ depends upon $\underline{\sigma}$ only through the magnetization $m_\sigma = (1/N)\sum_i \sigma_i$. In
general it has the form

$$P_{\underline{\sigma}} \sim \exp\left[N \Sigma^{(k,l)}(m_\sigma)\right]. \tag{4.2}$$

[Footnote 2] Notice that our model (2.3) has no spin-reversal symmetry.



The function $\Sigma^{(k,l)}(m)$ is computed in Appendix A for general values of $k$ and $l$, and is not
particularly illuminating. However, in the limit $k, l \to \infty$, with $l/k = 1 - R$ fixed, we have

$$\Sigma^{(k,l)}(m) \to -(1 - R)\log 2, \tag{4.3}$$

for any $-1 < m < 1$. In other words any spin configuration $\underline{\sigma}$ has the same probability
$P_{\underline{\sigma}} \sim 2^{-(1-R)N}$ of being a codeword. In addition we must keep track of the completely ordered
configurations $\sigma_i = +1$ for $i = 1, \dots, N$, and $\sigma_i = -1$ for $i = 1, \dots, N$. The positive one
satisfies all the constraints for any $k$ and $l$, and for any matrix $C$ (this configuration is quite
important for the thermodynamics of the model). The negative one satisfies the constraints
for $k$ even, but it is irrelevant for the thermodynamics.

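Let us now turn to a slightly more complicated quantity. We consider the joint probability
$P_{\underline{\sigma},\underline{\tau}}$ for two different spin configurations $\underline{\tau}$ and $\underline{\sigma}$ to satisfy the same set of constraints (2.4),
corresponding to some matrix $C$ taken from the $(k, l)$-ensemble. In formulae:

$$P_{\underline{\sigma},\underline{\tau}} = \frac{1}{\mathcal{N}_C} \sum_{C} \prod_{j=1}^{M} \delta[\sigma_{\omega_j}, +1]\, \delta[\tau_{\omega_j}, +1]. \tag{4.4}$$

As before we can argue that $P_{\underline{\sigma},\underline{\tau}}$ depends upon $\underline{\sigma}$ and $\underline{\tau}$ only through their magnetizations
$m_\sigma$, $m_\tau$, and their overlap $q = (1/N)\sum_i \sigma_i \tau_i$. The form of $P_{\underline{\sigma},\underline{\tau}}$ in the thermodynamic limit is

$$P_{\underline{\sigma},\underline{\tau}} \sim \exp\left[N \Sigma^{(k,l)}(m_\sigma, m_\tau, q)\right]. \tag{4.5}$$

The function $\Sigma^{(k,l)}(m_1, m_2, q)$ is computed in Appendix A. Again, we shall not report here
the result, but we remark that in the $k, l \to \infty$ limit

$$\Sigma^{(k,l)}(m_1, m_2, q) \to -2(1 - R)\log 2, \tag{4.6}$$

for any $-1 < m_1, m_2, q < 1$. In other words, the probability for two configurations $\underline{\sigma}$ and $\underline{\tau}$
to satisfy the same set of constraints is $P_{\underline{\sigma},\underline{\tau}} \sim P_{\underline{\sigma}} P_{\underline{\tau}} \sim 2^{-2(1-R)N}$: the two configurations can
be regarded as independent.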



4.2 The random codeword model 

The previous considerations allow us to replace (in the $k, l \to \infty$ limit) the original model
(2.5) with the following random codeword model (RCM). The model has $2^{NR}$ possible states
which we shall index with the letter $\alpha = 1, \dots, 2^{NR}$. To each of these states we associate a
random spin configuration $\underline{\sigma}^{(\alpha)} = (\sigma_1^{(\alpha)}, \dots, \sigma_N^{(\alpha)})$. By random we mean that each spin $\sigma_i^{(\alpha)}$ is
chosen independently from the others, and that $\sigma_i^{(\alpha)} = +1$ or $-1$ with equal probability. Let us
underline that, in the random codeword model, the $\sigma_i^{(\alpha)}$ are quenched variables, the dynamical
one being the index $\alpha$. There is a second set of quenched variables: the magnetic fields $h_i$,
with $i = 1, \dots, N$. As in the original model we take them to be random i.i.d. variables with
distribution $p_h(h_i)$. The energy of the state $\alpha$ reads

$$E^{(\alpha)} = -\sum_{i=1}^{N} h_i \sigma_i^{(\alpha)}. \tag{4.7}$$

To the $2^{NR}$ "disordered" states described above we add the ordered state $\alpha = 0$, and the
corresponding spin configuration $\underline{\sigma}^{(0)}$, with $\sigma_i^{(0)} = +1$ for $i = 1, \dots, N$. This corresponds to
the "all zeros" codeword $\underline{0}$. Its energy is obviously $E^{(0)} = -\sum_i h_i$.

Figure 2: The microcanonical entropy density of the RCM with binary field distribution, cf. Eq.
(2.6). Here we set $R = 1/2$, $p = 0.025$, $h_0 = {\rm arctanh}(1 - 2p)$. Notice the continuous contribution
coming from the random configurations (solid line), and the isolated ordered configuration (filled
circle).

The random codeword model can be solved through elementary methods. Here we shall
solve it for the $\pm h_0$ distribution of fields, see Eq. (2.6). At the end of this Section, we shall
quote the result for a general distribution $p_h(h_i)$. For the sake of clarity we shall report the
calculation for this case, which is slightly less straightforward, in Appendix B.

We begin by taking into account the "random" states $\alpha = 1, \dots, 2^{NR}$. Later we shall
consider the contribution coming from the ordered state $\alpha = 0$. Let us consider a fixed
configuration of the magnetic fields $\{h_i\}$. Since the probability distribution of the $\sigma_i^{(\alpha)}$ is
flat, $P(\{\sigma_i^{(\alpha)}\}) = 2^{-N 2^{NR}}$, we can apply a gauge transformation $\sigma_i^{(\alpha)} \to \epsilon_i \sigma_i^{(\alpha)}$, with $\epsilon_i = \pm 1$,
without changing their statistical properties. If we choose $\epsilon_i = {\rm sign}(h_i)$, the energy (4.7)
becomes $E^{(\alpha)} = -h_0 \sum_i \sigma_i^{(\alpha)}$. We conclude that, as far as the "random" states are concerned, the
$\pm h_0$ field distribution is equivalent to a uniform field $h_i = h_0$.

Now we would like to compute the typical number $\mathcal{N}_{\rm typ}(\epsilon)$ of states having a given energy
density $E^{(\alpha)}/N = \epsilon$. This is equal to the typical number of states having magnetization
$m^{(\alpha)} = -\epsilon/h_0$. This is a very simple problem. Define the function

$$\mathcal{H}(x) = -\frac{1 + x}{2}\log(1 + x) - \frac{1 - x}{2}\log(1 - x). \tag{4.8}$$

Then $\mathcal{N}_{\rm typ}(\epsilon) \sim \exp\{N R \log 2 + N \mathcal{H}(\epsilon/h_0)\}$ when $|\epsilon| < \epsilon_c$, and $\mathcal{N}_{\rm typ}(\epsilon) = 0$ otherwise. The
critical energy $\epsilon_c$ is fixed by the positive solution of $R\log 2 + \mathcal{H}(\epsilon/h_0) = 0$. The entropy
density of the system $s(\epsilon) = \log \mathcal{N}_{\rm typ}(\epsilon)/N$ is depicted in Fig. 2. Since $s'(-\epsilon_c) > 0$, the
(sub)system of the random codewords undergoes a freezing phase transition at the critical
inverse temperature $\beta_c = s'(-\epsilon_c)$. This phase transition is analogous to the one of the REM [21]: it
separates a high-temperature paramagnetic phase from a low-temperature frozen one.

Let us now consider the ordered state $\alpha = 0$, whose energy is given by $E^{(0)} = -\sum_i h_i$. In
this case we can apply the central limit theorem. For $N \to \infty$ the energy density of the state
$\alpha = 0$ is $\epsilon^{(0)} = -(1 - 2p) h_0$ with probability one. We have therefore the following picture of the
energy spectrum of the model: a single ordered state at $\epsilon^{(0)} = -(1 - 2p) h_0$, plus a bell-shaped
continuum between $-\epsilon_c(h_0)$ and $\epsilon_c(h_0)$. The ordered state is thermodynamically relevant as
long as it is separated by a gap from the continuum. This happens if $p < p_c(R)$, where $p_c(R)$
is the unique solution between 0 and 1/2 of the equation

$$R\log 2 + \mathcal{H}(1 - 2p) = 0. \tag{4.9}$$

Notice that Eq. (4.9) coincides with the equation determining the capacity of the binary
symmetric channel [?]. This means that, in the $k, l \to \infty$ limit, Gallager codes saturate
Shannon capacity.
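The critical flip probability can be obtained numerically by bisection. The sketch below assumes the form $\mathcal{H}(x) = -\frac{1+x}{2}\log(1+x) - \frac{1-x}{2}\log(1-x)$, which is the one for which the condition $R\log 2 + \mathcal{H}(1-2p) = 0$ reproduces the binary symmetric channel capacity $R = 1 - H_2(p)$, as stated in the text. Function names are illustrative.

```python
import math

def curly_h(x):
    """H(x) = -(1+x)/2 log(1+x) - (1-x)/2 log(1-x), for -1 < x < 1."""
    return -0.5 * (1 + x) * math.log(1 + x) - 0.5 * (1 - x) * math.log(1 - x)

def critical_flip_probability(rate, tol=1e-12):
    """Solve R log 2 + H(1 - 2p) = 0 for p in (0, 1/2) by bisection.

    The left-hand side is negative for small p (when R < 1) and positive
    at p = 1/2, and it is increasing in p on this interval.
    """
    lo, hi = 1e-12, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rate * math.log(2) + curly_h(1 - 2 * mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def binary_entropy(p):
    """Binary entropy in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

pc = critical_flip_probability(0.5)
print(pc, 1 - binary_entropy(pc))
```

For $R = 1/2$ the root lies close to $p = 0.11$, and the second printed number recovers the rate.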

The free energy is easily determined from the entropy:

$$f(\beta) = \min_{\epsilon}\left\{\epsilon - \frac{1}{\beta}\, s(\epsilon)\right\}. \tag{4.10}$$





The phase diagram includes three different phases: a paramagnetic (P) and a spin-glass (SG)
phase, associated with the continuum part of the energy spectrum; and a ferromagnetic (F) phase,
associated with the ordered state. The free energy of the paramagnetic phase is given by:

$$f_P(\beta) = -\frac{R}{\beta}\log 2 - \frac{1}{\beta}\log\cosh\beta h_0. \tag{4.11}$$

The paramagnetic-spin glass phase boundary is given by the zero-entropy condition $\partial f_P/\partial\beta = 0$.
We obtain the curve $\beta h_0 = {\rm arctanh}(1 - 2p_c(R)) \equiv h^*(R)$. At the transition the system
freezes and the free energy in the spin-glass phase is

$$f_{SG}(\beta) = f_P(\beta = h^*(R)/h_0) = -h_0\,(1 - 2p_c(R)). \tag{4.12}$$

The ferromagnetic free energy is nothing but the energy of the ferromagnetic state:

$$f_F(\beta) = -h_0\,(1 - 2p). \tag{4.13}$$

The ferromagnetic-spin glass phase boundary has therefore the simple form $p = p_c(R)$.
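The three free energies can be combined into a small phase classifier for the two-peak field distribution. This is a sketch under stated assumptions: the paramagnetic branch is taken as $f_P = -[R\log 2 + \log\cosh\beta h_0]/\beta$ for $\beta h_0 < h^*(R)$, the frozen branch as $-h_0(1 - 2p_c)$ beyond it, the ferromagnetic one as $-h_0(1 - 2p)$, and the stable phase as the one of lowest free energy. All function names are illustrative.

```python
import math

def curly_h(x):
    """H(x) = -(1+x)/2 log(1+x) - (1-x)/2 log(1-x)."""
    return -0.5 * (1 + x) * math.log(1 + x) - 0.5 * (1 - x) * math.log(1 - x)

def critical_flip_probability(rate, tol=1e-12):
    """Bisection solution of R log 2 + H(1 - 2p) = 0 in (0, 1/2)."""
    lo, hi = 1e-12, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rate * math.log(2) + curly_h(1 - 2 * mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rcm_phase(beta, p, h0, rate):
    """Return (label, free energy) with label in {"P", "SG", "F"}."""
    p_c = critical_flip_probability(rate)
    h_star = math.atanh(1 - 2 * p_c)
    frozen = beta * h0 >= h_star  # random states are frozen beyond this line
    if frozen:
        f_random = -h0 * (1 - 2 * p_c)
    else:
        f_random = -(rate * math.log(2) + math.log(math.cosh(beta * h0))) / beta
    f_ferro = -h0 * (1 - 2 * p)
    if f_ferro < f_random:
        return "F", f_ferro
    return ("SG" if frozen else "P"), f_random

# Binary symmetric channel choice h0 = arctanh(1 - 2p), rate 1/2.
for beta, p in ((1.0, 0.05), (1.0, 0.2), (3.0, 0.2)):
    print(beta, p, rcm_phase(beta, p, math.atanh(1 - 2 * p), 0.5))
```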

For the sake of clarity, let us consider the magnetic field distribution which describes a binary
symmetric channel, i.e. let us fix $h_0 = h_0(p) = {\rm arctanh}(1 - 2p)$, cf. Eq. (2.8). The resulting
phase diagram is reported in Fig. 3. The ferromagnetic-spin glass phase boundary is at
$p = p_c(R)$. The paramagnetic-spin glass boundary is $\beta\, {\rm arctanh}(1 - 2p) = {\rm arctanh}(1 - 2p_c(R))$.
Finally the ferromagnetic-paramagnetic phase boundary is given by

$$R\log 2 + \log\cosh\beta h_0(p) - \beta h_0(p)\tanh h_0(p) = 0. \tag{4.14}$$

The triple point is at $\beta = 1$, $p = p_c(R)$, and lies on the Nishimori line.

Figure 3: The phase diagram for binary (left, see Eq. (2.6)), and gaussian (right, see Eq. (2.7))
field distribution. In both cases the field distribution was chosen to satisfy Eq. (2.8).

Until now we treated the simple case of a two-peak distribution of the magnetic fields:
$p_h(h_i) = (1 - p)\,\delta(h_i - h_0) + p\,\delta(h_i + h_0)$. What happens for a generic $p_h(h_i)$? In
Appendix B it is shown that the same scenario applies with some slight modification. The free
energy in the paramagnetic phase becomes

$$f_P(\beta) = -\frac{R}{\beta}\log 2 - \frac{1}{\beta}\langle \log\cosh\beta h \rangle_h. \tag{4.15}$$

The system undergoes a freezing transition at a critical temperature $\beta_c$ determined from the
condition $\partial f_P/\partial\beta|_{\beta_c} = 0$. For $\beta > \beta_c$, the system is in a glassy phase with free energy
$f_{SG}(\beta) = f_P(\beta_c)$. Finally, the ferromagnetic phase coincides with the ordered state $\alpha = 0$,
and has free energy $f_F(\beta) = -\langle h \rangle_h$.

To be specific we report in Fig. 3 the phase diagram for the gaussian distribution

$$p_h(h_i) = \sqrt{\frac{w}{2\pi}}\, \exp\left\{-\frac{w}{2}\left(h_i - \frac{1}{w}\right)^2\right\}, \tag{4.16}$$

which describes a gaussian channel with noise variance $w$. The triple point is located at $\beta = 1$
and $w = w_c(R)$, $w_c(R)$ being the solution of the equation below

$$R\log 2 + \langle \log\cosh h \rangle_h - \langle h\tanh h \rangle_h = 0. \tag{4.17}$$

It is easy to show that the solution $R(w)$ of the above equation corresponds to the capacity of
a gaussian channel with constrained binary inputs [?].






5 The replica calculation 



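As always [22] we compute the integer moments $\langle Z^n \rangle_{h,C}$ of the partition function by replicating
the system $n$ times. To the leading exponential order we get

$$\langle Z^n \rangle_{h,C} \sim \int \prod_{\vec{\sigma}} d\lambda(\vec{\sigma})\, d\hat{\lambda}(\vec{\sigma})\; e^{-N S[\lambda, \hat{\lambda}]}, \tag{5.1}$$

where

$$\begin{aligned} S[\lambda, \hat{\lambda}] ={}& l \sum_{\vec{\sigma}} \lambda(\vec{\sigma})\hat{\lambda}(\vec{\sigma}) - \frac{l}{k} \sum_{\vec{\sigma}_1, \dots, \vec{\sigma}_k} \lambda(\vec{\sigma}_1) \cdots \lambda(\vec{\sigma}_k) \prod_{a=1}^{n} \delta[\sigma_1^a \cdots \sigma_k^a, +1] \\ & - \log\left\{\sum_{\vec{\sigma}} \hat{\lambda}(\vec{\sigma})^l \left\langle e^{\beta h \sum_a \sigma^a} \right\rangle_h\right\} - l + \frac{l}{k}, \end{aligned} \tag{5.2}$$

and $\vec{\sigma} = (\sigma^1, \dots, \sigma^n)$ is the replicated spin variable. The calculations which lead to Eq. (5.2)
are completely analogous to the ones of Refs. [17, 19]. To be self-contained we shall sketch them
in Appendix C. The free energy $f(\beta)$ is obtained by taking the saddle point of the integral
(let us say $\lambda = \lambda^*$, $\hat{\lambda} = \hat{\lambda}^*$) and evaluating the $n \to 0$ limit: $\beta f(\beta) = \lim_{n \to 0} \partial_n S[\lambda^*, \hat{\lambda}^*]$.
The saddle point equations are

$$\hat{\lambda}(\vec{\sigma}) = \sum_{\vec{\sigma}_1, \dots, \vec{\sigma}_{k-1}} \lambda(\vec{\sigma}_1) \cdots \lambda(\vec{\sigma}_{k-1}) \prod_{a=1}^{n} \delta[\sigma^a \sigma_1^a \cdots \sigma_{k-1}^a, +1], \tag{5.3}$$

$$\lambda(\vec{\sigma}) = \frac{\hat{\lambda}(\vec{\sigma})^{l-1} \left\langle e^{\beta h \sum_a \sigma^a} \right\rangle_h}{\sum_{\vec{\sigma}'} \hat{\lambda}(\vec{\sigma}')^{l} \left\langle e^{\beta h \sum_a \sigma'^a} \right\rangle_h}. \tag{5.4}$$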



The above equations are satisfied by the totally ordered solution $\lambda_0(\vec{\sigma}) = \hat{\lambda}_0(\vec{\sigma}) = \delta_{\vec{\sigma}, \vec{\sigma}_0}$,
where $\vec{\sigma}_0 = (+1, \dots, +1)$. The corresponding free energy is $f_F(\beta) = -\langle h \rangle_h$. Such a solution
is possible because of the infinite-strength ferromagnetic interactions in our model (2.3).
Physically it is related to the configuration $\{\sigma_i = +1\}_{i=1,\dots,N}$, which satisfies all the constraints [footnote 3].
5.1 Stability of the ferromagnetic phase 

In the ferromagnetic solution found above (as in the ferromagnetic phase found in Sec. 4) the
system is completely ordered (i.e. the magnetization is $m = 1$). This corresponds to error-free
communication in the coding language. Knowing the boundaries of the ferromagnetic phase
is therefore of great practical relevance. Here we shall investigate the issue of local stability.
The calculation is similar (although much simpler) to the one carried out for turbo codes in
Ref. [12].



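We start by computing the replicated action (5.2) for $\lambda(\vec{\sigma})$, $\hat{\lambda}(\vec{\sigma})$ "near" the ferromagnetic
saddle point, namely $\lambda(\vec{\sigma}) = \lambda_0(\vec{\sigma}) + \delta(\vec{\sigma})$, $\hat{\lambda}(\vec{\sigma}) = \hat{\lambda}_0(\vec{\sigma}) + \hat{\delta}(\vec{\sigma})$. We first consider the case
$l > 2$:

$$\delta S[\lambda_0, \hat{\lambda}_0] = l \sum_{\vec{\sigma}} \delta(\vec{\sigma})\hat{\delta}(\vec{\sigma}) - \frac{1}{2}\, l(k - 1) \sum_{\vec{\sigma}} \delta(\vec{\sigma})^2 + \frac{1}{2}\, l\, \hat{\delta}(\vec{\sigma}_0)^2 + O(\delta^3), \tag{5.5}$$

where $\delta S[\lambda_0, \hat{\lambda}_0] = S[\lambda_0 + \delta, \hat{\lambda}_0 + \hat{\delta}] - S[\lambda_0, \hat{\lambda}_0]$. It is convenient to integrate over $\lambda(\vec{\sigma})$ using
the saddle point equation (5.3), which, for $\lambda(\vec{\sigma}) = \lambda_0(\vec{\sigma}) + \delta(\vec{\sigma})$, $\hat{\lambda}(\vec{\sigma}) = \hat{\lambda}_0(\vec{\sigma}) + \hat{\delta}(\vec{\sigma})$, gives
$\delta(\vec{\sigma}) = \hat{\delta}(\vec{\sigma})/(k - 1) + O(\hat{\delta}^2)$. We finally get

$$\delta S[\lambda_0, \hat{\lambda}_0] = \frac{1}{2} \sum_{\vec{\sigma}} \zeta_{\vec{\sigma}}\, \hat{\delta}(\vec{\sigma})^2 + O(\hat{\delta}^3), \tag{5.6}$$

where $\zeta_{\vec{\sigma}_0} = lk/(k - 1)$, and $\zeta_{\vec{\sigma}} = l/(k - 1)$ for $\vec{\sigma} \neq \vec{\sigma}_0$. We conclude that, for $l > 2$, the
ferromagnetic phase is always locally stable and its boundaries must correspond to first order
phase transitions.

[Footnote 3] Notice that, for $k$ even, there are $2^n$ solutions of the type $\lambda(\vec{\sigma}) = \hat{\lambda}(\vec{\sigma}) = \delta_{\vec{\sigma}, \vec{\tau}}$. The "spurious" solutions with
$\vec{\tau} \neq \vec{\sigma}_0$ are related to the $\{\sigma_i = -1\}_{i=1,\dots,N}$ configuration. Since we took $\langle h \rangle_h > 0$, these solutions do not have
thermodynamical relevance.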

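For $l = 2$ the situation is physically different. Equation (5.6) is still valid, with $\zeta_{\vec{\sigma}_0} = 2k/(k - 1)$ and

$$\zeta_{\vec{\sigma}} = 2\left[\frac{1}{k - 1} - \frac{\langle e^{\beta h \sum_a \sigma^a} \rangle_h}{\langle e^{\beta h n} \rangle_h}\right] \tag{5.7}$$

for $\vec{\sigma} \neq \vec{\sigma}_0$. We have therefore $n$ different eigenvalues $\zeta_{n,\omega}$, with degeneracies $\binom{n}{\omega/2}$, where
$\omega = n - \sum_a \sigma^a$. The first instability occurs for $\omega = 1$. The corresponding critical line is given
by $(k - 1)\langle e^{-\beta_c h} \rangle_h = 1$. This local stability condition is already known [29] in the coding
community, although it has been obtained by completely different methods.

Hereafter we shall focus on the case $l \geq 3$.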
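As an illustration of the local stability condition for $l = 2$, the sketch below evaluates it on the Nishimori line $\beta = 1$ for the two-peak field distribution with $h_0 = (1/2)\log[(1-p)/p]$. It assumes that the critical line reads $(k - 1)\langle e^{-\beta h} \rangle_h = 1$; with this choice $\langle e^{-h} \rangle_h = 2\sqrt{p(1-p)}$, so the threshold can be compared with a closed form. Function names are illustrative.

```python
import math

def mean_exp_minus_h(p):
    """<exp(-h)>_h for fields +h0 (prob. 1-p) and -h0 (prob. p),
    with h0 = (1/2) log((1-p)/p); equals 2 sqrt(p (1-p))."""
    h0 = 0.5 * math.log((1 - p) / p)
    return (1 - p) * math.exp(-h0) + p * math.exp(h0)

def stability_threshold(k, tol=1e-12):
    """Largest p in (0, 1/2) with (k-1) <exp(-h)>_h < 1, found by bisection.

    Assumes beta = 1 and l = 2; the average is increasing in p on (0, 1/2).
    """
    lo, hi = 1e-12, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (k - 1) * mean_exp_minus_h(mid) < 1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

threshold = stability_threshold(4)
closed_form = (1 - math.sqrt(1 - 1 / 9)) / 2  # root of 2 sqrt(p(1-p)) = 1/3
print(threshold, closed_form)
```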



5.2 Replica symmetric approximation 

The simplest approximation for treating the $n \to 0$ limit consists in choosing $\lambda(\vec{\sigma})$ and $\hat{\lambda}(\vec{\sigma})$
to be replica symmetric, i.e. to depend upon $\vec{\sigma}$ only through the symmetric combination
$\sum_a \sigma^a$. A commonly adopted parametrization [30] is the following

$$\lambda(\vec{\sigma}) = \int dx\, \pi(x)\, \frac{e^{\beta x \sum_{a=1}^{n} \sigma^a}}{(2\cosh\beta x)^n}, \tag{5.8}$$

and the analogous one for $\hat{\lambda}(\vec{\sigma})$ (with a different distribution $\hat{\pi}(y)$). The replica symmetric
order parameters $\pi(x)$ and $\hat{\pi}(y)$ have the physical meaning of probability distributions of cavity
fields. In particular

$$P(H) = \int dx\, \pi(x) \int dy\, \hat{\pi}(y)\, \delta(H - x - y) \tag{5.9}$$

is the probability distribution of the effective fields $H_i = (1/\beta)\,{\rm arctanh}(\langle \sigma_i \rangle)$.
Using the ansatz (5.8), we easily obtain the replica symmetric free energy:

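$$\begin{aligned} \beta f[\pi, \hat{\pi}] ={}& \frac{l}{k}\log 2 - \langle \log\cosh\beta h \rangle_h + l \int dx\, \pi(x) \int dy\, \hat{\pi}(y)\, \log[1 + t_\beta(x)\, t_\beta(y)] \\ & - \frac{l}{k} \int dx_1\, \pi(x_1) \cdots \int dx_k\, \pi(x_k)\, \log[1 + t_\beta(x_1) \cdots t_\beta(x_k)] \\ & - \int dy_1\, \hat{\pi}(y_1) \cdots \int dy_l\, \hat{\pi}(y_l)\, \langle \log F_l(h, y_1, \dots, y_l; \beta) \rangle_h, \end{aligned} \tag{5.10}$$

where we defined $t_\beta(x) = \tanh\beta x$ and

$$F_l(y_0, y_1, \dots, y_l; \beta) = \prod_{i=0}^{l} (1 + t_\beta(y_i)) + \prod_{i=0}^{l} (1 - t_\beta(y_i)). \tag{5.11}$$

The field distributions $\pi(x)$ and $\hat{\pi}(y)$ are determined by the saddle point equations:

$$\hat{\pi}(y) = \int dx_1\, \pi(x_1) \cdots \int dx_{k-1}\, \pi(x_{k-1})\; \delta\left[y - \frac{1}{\beta}\,{\rm arctanh}\big(t_\beta(x_1) \cdots t_\beta(x_{k-1})\big)\right], \tag{5.12}$$

$$\pi(x) = \int dy_1\, \hat{\pi}(y_1) \cdots \int dy_{l-1}\, \hat{\pi}(y_{l-1})\, \langle \delta(x - h - y_1 - \dots - y_{l-1}) \rangle_h. \tag{5.13}$$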


The above equations can be solved either numerically or in some particular limit. In the next 
Section we will see that the expansion around the random codeword limit provides rather 
accurate results. 
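
As a quick consistency check on the replica symmetric free energy, one can evaluate it on the trivial trial distribution $\hat\pi(y) = \delta(y)$. The sketch below assumes the prefactors $l/k$ in front of $\log 2$ and of the $k$-fold integral in Eq. (5.10), the rate $R = 1 - l/k$, and a distribution $\pi(x)$ supported on finite values of $x$; it is a sanity check under these assumptions, not part of the original derivation.

```latex
\begin{align*}
\hat\pi(y) = \delta(y):\quad
 & \log[1 + t_\beta(x)\,t_\beta(0)] = 0, \qquad
   F_l(h, 0, \dots, 0; \beta) = (1 + t_\beta(h)) + (1 - t_\beta(h)) = 2, \\
\beta f_P
 &= \frac{l}{k}\log 2 - \langle \log\cosh\beta h \rangle_h
    - \frac{l}{k} \int \prod_{i=1}^{k} dx_i\, \pi(x_i)\,
      \log\Bigl[1 + \prod_{i=1}^{k} t_\beta(x_i)\Bigr] - \log 2 \\
 &= -R \log 2 - \langle \log\cosh\beta h \rangle_h
    + (\text{terms exponentially small in } k).
\end{align*}
```

The leading terms coincide with the paramagnetic free energy of the random codeword model, which is the behavior expected in the large connectivity limit.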



5.3 One step replica symmetry breaking 

To go beyond the replica symmetric approximation, one has to divide the $n$ replicas into $n/m$
subgroups of $m$ replicas (with $1 \le m \le n$). The order parameters $\lambda(\vec\sigma)$ and $\hat\lambda(\vec\sigma)$ depend upon
$\vec\sigma$ through the $n/m$ variables $\sigma_\alpha = \sum_{a=m(\alpha-1)+1}^{m\alpha} \sigma^a$. As discussed clearly in Refs. |23|,[n|], in
the $n \to 0$ limit the order parameter becomes a functional over a probability space and the
calculations become rather cumbersome (see Refs. |H,32| for two viable approaches).



In our case there exists a very simple solution to the saddle point equations flO|) , (| 
incorporating one step replica symmetry breaking: 

r f)xy\ n t™ s a n / m am 

Hg) = J2j^ m(x) ^—=L_ U n H°',s°], (5.14) 

{s a } v ' ' a=\ o=(a-l)m+l 

and the analogous one for $\hat\lambda(\vec\sigma)$ (with a different distribution $\hat\pi_m(y)$). It is easy to see that
the above ansatz satisfies the saddle point equations as soon as $\pi_m(x)$, $\hat\pi_m(y)$ are solutions
of the replica symmetric equations (5.12), (5.13), with the substitution $h \to mh$. The phase
described by the solution (5.14) is completely analogous to the spin-glass phase found in
the random codeword model. The system is frozen in a large number of "optimal" configurations
(with self-overlap $q_{EA} = 1$). The overlap between two such configurations is
$q = \int\! dx\, \pi_m(x) \int\! dy\, \hat\pi_m(y)\, t_\beta(x + y)$.

Such a simple scenario (and the simple solution (5.14)) is possible because the multi-spin
interactions of the model (2.5) have infinite strength. The existence of other replica-symmetry-breaking
solutions is an open issue, see Sec. ^. In the next Section we will show that our ansatz
gives back the RCM solution, see Sec. ||], in the $k, l \to \infty$ limit.

The free energy of the solution (5.14) is $f_{SG,m}(\beta) = f_P(\beta m)$, see Eq. (5.10), and has to be
optimized over $m$ with $0 \le m \le 1$. This procedure yields the spin-glass free energy $f_{SG}(\beta) = f_P(\beta_c)$,
and $m = \beta_c/\beta$. The critical temperature $\beta_c$ is given by the marginality condition
$\partial_m f_{SG,m}(\beta)|_{m=1} = 0$, which coincides with the zero-entropy condition $\partial_\beta f_P(\beta)|_{\beta=\beta_c} = 0$.
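
The optimization over $m$ can be spelled out in one line. The following sketch assumes that $f_P$ is differentiable and that the entropy is $s(\beta) = \beta^2 \partial_\beta f_P(\beta)$, so that the zero-entropy point is the stationary point of $f_P$; the prime denotes a derivative with respect to the argument.

```latex
\begin{align*}
\frac{\partial}{\partial m} f_{SG,m}(\beta)
  = \frac{\partial}{\partial m} f_P(\beta m)
  = \beta\, f_P'(\beta m) = 0
  \;&\Longrightarrow\; \beta m = \beta_c , \\
m = \frac{\beta_c}{\beta} \le 1 \;\;(\beta \ge \beta_c),
  \qquad
f_{SG}(\beta) &= f_P(\beta m)\big|_{m = \beta_c/\beta} = f_P(\beta_c) .
\end{align*}
```

At $\beta = \beta_c$ the stationary point sits at $m = 1$, which is the marginality condition quoted above.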
Let us now draw some consequences of our solution (5.14) for the phase diagram of the
model. Since both the spin-glass and the ferromagnetic free energies are temperature independent,
the ferromagnetic-spin glass phase boundary must stay parallel to the temperature
axis. If, for instance, we consider the binary field distribution (2.6) with $h_0 = \mathrm{arctanh}(1 - 2p)$,
this boundary is simply given by $p = p_c(k,l)$. Moreover we notice that the energy density
on the line $\beta = 1$, see Eq. (3.7), is equal to the ferromagnetic free energy. This implies
that the entropy vanishes at the ferromagnetic-paramagnetic boundary for $\beta = 1$. Since the
paramagnetic-spin glass boundary is determined by the zero entropy condition, this point must
be the triple point. In summary, the main characteristics of the phase diagram depicted in
Fig. H remain valid for finite connectivities.



6 Large k, l expansion

Here we show that the replica solution exhibited in the previous Section goes to the random
codeword model solution (cf. Sec. |j) when $l, k \to \infty$ at $l/k = 1 - R$ fixed. Moreover we want
to stress that this limit can be useful from a quantitative point of view. In fact, the corrections
for finite $k$ are exponentially small in $k$.

Notice that the free energy in the spin glass phase $f_{SG}(\beta)$ is easily obtained from the paramagnetic
free energy $f_P(\beta)$. In fact we have $f_{SG}(\beta) = f_P(\beta_c)$, where the freezing temperature
$\beta_c$ is given by the zero-entropy condition $\partial_\beta f_P(\beta) = 0$. Moreover the ferromagnetic free energy
is $f_F(\beta) = -\langle h\rangle_h$, and does not depend upon $k$ and $l$. It is then sufficient to solve Eqs. (5.12),
(5.13) for large $k, l$ and evaluate Eq. (5.10) on the solution. The result is $f_P^{\rm exp}(\beta)$ ("exp" stands
for "expanded"), and allows us to reconstruct the whole phase diagram as explained above.

The expansion is obtained by noticing that the product $t_\beta(x_1)\cdots t_\beta(x_{k-1})$ which appears
on the right-hand side of Eq. (5.12) is exponentially small in $k$ as long as $\pi(x)$ is supported
on finite values of $x$. We then expand the right-hand side of Eq. (5.13) for small values of
$y$ and plug the result in Eq. (5.12).



The calculations are straightforward. For the sake of simplicity we show some consequences
for the two-peak field distribution (2.6). We refer to Appendix D for the general results.

In Fig. 4 we report the modified phase diagram for $k = 6$, $l = 3$, as computed using
the expansion of Appendix D (cf. Eq. (D.8)) for the paramagnetic free energy. We
consider the two-peak distribution (2.6) with $h_0 = \mathrm{arctanh}(1 - 2p)$. The paramagnetic/spin-glass
boundary is obtained by imposing the zero-entropy condition $\partial_\beta f_P^{\rm exp}(\beta) = 0$. We set
$f_{SG}^{\rm exp}(\beta) = f_P^{\rm exp}(\beta_c)$. The ferromagnetic/spin-glass and ferromagnetic/paramagnetic boundaries
are obtained by imposing $f_F(\beta) = f_{SG}^{\rm exp}(\beta)$ and $f_F(\beta) = f_P^{\rm exp}(\beta)$.

The triple point is at $\beta = 1$, $p = p_c(k,l)$. As we stressed in Sec. the line $\beta = 1$ is of great
practical importance, since it corresponds to a widespread decoding procedure (MAP decoding).
The critical noise $p_c(k,l)$ has the meaning of the threshold for no-error communication
under MAP decoding. Since the ferromagnetic-spin glass phase boundary stays parallel to the
temperature axis, $p_c(k,l)$ is also the threshold for any "finite-temperature" decoding [g7j for
$\beta > 1$. We get

Pc(k > l)=P ° c ~ 47^T=^°j (1 " ^ + " 2p ° c)ik) ' (6 ' 1} 



where the function $\mathcal{H}(x)$ has been defined in Eq. (4.8). In the $k, l \to \infty$ limit, we recover the
threshold $p_c^0 = p_c(R)$ of the random codeword model, given by the solution of Eq. (4.9). The
deviations from the optimal properties of the random-codeword model are exponentially small
for large $k$.



Figure 4: The phase diagram for the (6, 3) code as computed from the large k, l expansion (continuous
lines), and the one of the RCM (dashed lines). The vertical dashed line is the Nishimori line
$\beta = 1$.

Figure 5: The error probability per bit (filled circles and upper curves), and the entropy (empty
triangles and lower curves) for the (6, 3) model with binary field distribution (2.6), as functions of $p$.
We set $\beta = 1$ and $h_0 = \mathrm{arctanh}(1 - 2p)$. The symbols are obtained by solving numerically the saddle point equations
(5.12), (5.13). The dashed lines are the RCM results. The continuous lines are the results of the
large-connectivity expansion.

Equations (5.12) and (5.13) can be solved numerically by a "population dynamics" algorithm.
One represents the distributions $\pi(x)$ and $\hat\pi(y)$ by two populations $\{x_i\}_{i=1,\dots,\mathcal{L}}$ and
$\{y_j\}_{j=1,\dots,\mathcal{L}}$, and then iterates the equations (5.12) and (5.13). This method has been already
used, for instance, in Ref. [31]. In Fig. 5 we consider once again the line $\beta = 1$ and compare
the results of the large $k, l$ expansion with the numerical solution of Eqs. (5.12) and (5.13). We
plot both the entropy and the average error probability per bit $\langle P_e\rangle_{h,C}$, where:



1 N 1 



(6.2) 



To conclude, let us consider the problem of calculating the critical noise $p_c(k,l)$. This
can be obtained either by solving numerically Eqs. (5.12) and (5.13), or from the expansion
(6.1). The numerical solution yields $p_c(k,l) = 0.0997(2)$, $0.1071(2)$, $0.1091(2)$ for, respectively,
$(k,l) = (6,3)$, $(8,4)$, $(10,5)$. From the expansion (6.1) we get $p_c^{\rm exp}(k,l) \approx 0.103965$, $0.107783$,
$0.109195$ for the same values of $k$ and $l$.
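
The population dynamics iteration of Eqs. (5.12) and (5.13) described above can be sketched as follows. This is a minimal illustration, not the code used for the results quoted here: the function name, the population size, the number of sweeps and the helper `sample_h` are hypothetical, and the binary field distribution is assumed to put weight $1 - p$ on $+h_0$ and weight $p$ on $-h_0$.

```python
import numpy as np


def population_dynamics(k, l, beta, h0, p, size=20000, sweeps=100, seed=0):
    """Iterate Eqs. (5.12)-(5.13) on two populations of cavity fields.

    Assumption (not taken from the text): the binary field distribution
    puts weight 1 - p on +h0 and weight p on -h0.
    """
    rng = np.random.default_rng(seed)

    def sample_h(n):
        # Hypothetical helper: draw n external fields from the two-peak law.
        return np.where(rng.random(n) < p, -h0, h0)

    x = sample_h(size)    # population representing pi(x)
    y = np.zeros(size)    # population representing pi_hat(y)
    for _ in range(sweeps):
        # Eq. (5.12): y = (1/beta) arctanh( prod_{i=1}^{k-1} tanh(beta x_i) )
        picks = rng.integers(size, size=(size, k - 1))
        prod = np.prod(np.tanh(beta * x[picks]), axis=1)
        prod = np.clip(prod, -1 + 1e-15, 1 - 1e-15)  # keep arctanh finite
        y = np.arctanh(prod) / beta
        # Eq. (5.13): x = h + y_1 + ... + y_{l-1}
        picks = rng.integers(size, size=(size, l - 1))
        x = sample_h(size) + y[picks].sum(axis=1)
    return x, y
```

Sampling pairs $(x, y)$ independently from the two returned populations and adding them gives an estimate of the effective field distribution $P(H)$ of Eq. (5.9).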



7 Finite size corrections and numerical results 

In this Section we compare the analytical predictions with numerical results in order to confirm 
the validity of the former and to investigate the nature of finite size corrections. Needless to 
say, the last one is a point of utmost practical importance in coding theory. Indeed it is known 
that the thermodynamic limit is approached exponentially fast in the ferromagnetic phase, at 
zero temperature Q. We expect the same behavior to hold in the whole ferromagnetic phase. 

Here we focus on the paramagnetic-spin glass phase transition. We compute the finite size
corrections to the free energy of the RCM. This calculation is compared with exact enumeration
calculations on small systems. Then we switch to the complete model (2.5) and compare the
numerical results with the outcome of the replica calculations, cf. Sec. H.



7.1 The random codeword model 



Let us consider, for the sake of clarity, the binary distribution (2.6) with $p > p_c(R)$. This corresponds
to focusing on the paramagnetic-spin glass phase transition. Under this condition
the ordered state $\alpha = 0$ belongs to the continuous part of the spectrum and there is no energy
gap. We shall therefore neglect this state. Its contribution is exponentially small in the
thermodynamic limit.

With this assumption, we obtain the following result for the free energy density

$$f(\beta,N) = f_0(\beta) + \frac{1}{N}\, f_1(\beta,N) + O(1/N^2)\,. \tag{7.1}$$

The leading term has been already computed in Sec. ||. The first correction $f_1(\beta,N)$ vanishes
in the paramagnetic phase and depends weakly upon $N$. Explicit formulae are given in Appendix E.
In particular $f_1(\beta,N) \sim (1/2\beta_c)\log N$ as $N \to \infty$. The leading correction in the
paramagnetic phase is exponentially small in $N$. In order to compute it, the ferromagnetic
state cannot be neglected.

Figure 6: Finite size correction to the free energy (a) and to the entropy (b) of the RCM, as functions of $1/\beta$. The
continuous lines are the results of numerical computations for $N = 40, 80, 120, 160, 200$ (error bars
are not visible on this scale). The dashed lines are the analytical results for the leading finite size
correction, for $N = 40, 200$ (a) and $N = 200$ (b).

It is very easy to compute numerically the finite-$N$ free energy for the random codeword
model with binary field distribution (2.6), as long as we neglect the ordered state. All we need,
for a given sample, is the energy spectrum. Let us call $\nu_k$, with $k = 0, \dots, N$, the number of
states $\alpha$ such that $E^{(\alpha)} = -h_0(N - 2k)$. The probability distribution of the spectrum $\{\nu_k\}$ is

$$P(\{\nu_k\}) = \frac{\mathcal{N}!}{\prod_{k=0}^{N} \nu_k!}\, \prod_{k=0}^{N} p_k^{\nu_k}\,, \tag{7.2}$$



where $\sum_k \nu_k = \mathcal{N} = 2^{NR}$, and

$$p_k = \frac{1}{2^N} \binom{N}{k}\,. \tag{7.3}$$

Once the $\{\nu_k\}$ have been generated with probability distribution (7.2), the partition function
is given by $Z(\beta) = \sum_k \nu_k \exp\{\beta h_0 (N - 2k)\}$.
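
The sampling procedure just described can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the original code: the function names are hypothetical, the occupation numbers are drawn from a multinomial law with binomial weights $p_k = 2^{-N}\binom{N}{k}$, and the total number of states $2^{NR}$ must fit in a 64-bit integer, which restricts this version to moderate $N$.

```python
import math
import numpy as np


def sample_spectrum(N, R, rng):
    """Draw the occupation numbers nu_k, k = 0..N, from a multinomial law."""
    n_states = 2 ** round(N * R)  # total number of random codewords, 2^{NR}
    # Binomial weights p_k = 2^{-N} * binom(N, k); they sum to one.
    p = np.array([math.comb(N, k) for k in range(N + 1)], dtype=float) / 2.0 ** N
    return rng.multinomial(n_states, p)


def free_energy_density(nu, beta, h0):
    """Return f = -(1/(beta N)) log Z, Z = sum_k nu_k exp(beta h0 (N - 2k))."""
    N = len(nu) - 1
    k = np.arange(N + 1)
    occupied = nu > 0  # skip empty levels to avoid log(0)
    log_terms = np.log(nu[occupied]) + beta * h0 * (N - 2 * k[occupied])
    top = log_terms.max()
    log_Z = top + np.log(np.exp(log_terms - top).sum())  # stable log-sum-exp
    return -log_Z / (beta * N)
```

Averaging `free_energy_density` over many independent calls to `sample_spectrum` gives a sample-averaged free energy at fixed $N$; the log-sum-exp form avoids overflow at large $\beta$.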

We considered the RCM with rate $R = 1/2$ and binary field distribution (2.6) with
$h_0 = \mathrm{arctanh}(1 - 2p)$. The phase diagram of this model is depicted in Fig. ||. We fixed the flip
probability $p = 0.2$ to be greater than the threshold $p_c(1/2) \approx 0.110025$, and computed the
temperature dependence of the free energy by averaging over $10^5$ realizations of the spectrum
$\{\nu_k\}$.

In Fig. 6, graph (a), we plot the quantity $\Delta f(\beta,N) = [f(\beta,N) - f_0(\beta)]N$, together with
the theoretical prediction $f_1(\beta,N)$ for several values of $N$. In Fig. 6, graph (b), we consider
the entropy density $s(\beta,N) = \beta^2 \partial_\beta f(\beta,N)$: we plot the difference
$\Delta s(\beta,N) = [s(\beta,N) - s_0(\beta)]N$, for the same values of $N$, together with
$s_1(\beta,N) = \beta^2 \partial_\beta f_1(\beta,N)$ for $N = 200$ (the
$N$ dependence of $s_1(\beta,N)$ is rather weak).

Two remarks can be made by looking at Fig. 6. First, the $O(1/N^2)$ terms in Eq. (7.1)
seem to be rather small. If the temperature is not too close to the critical point, the finite
size corrections are well described by $f_1(\beta,N)$. Second, the curves for $\Delta f(\beta,N)$, see Fig. 6,
graph (a), seem to cross at the critical point. This is expected since $\Delta f(\beta,N) \sim (1/2\beta_c)\log N$
for $\beta > \beta_c$, and $\Delta f(\beta,N) \sim e^{-\kappa N}$ for $\beta < \beta_c$. The crossing point $\beta_{N,N'}$ between the curves
$\Delta f(\beta,N)$ and $\Delta f(\beta,N')$ can be used to estimate $\beta_c$. From the data of Fig. 6 we get

$$\beta_{40,80} = 1.52(1)\,, \quad \beta_{80,120} = 1.51(1)\,, \quad \beta_{120,160} = 1.51(1)\,, \quad \beta_{160,200} = 1.51(1)\,, \tag{7.4}$$



Figure 7: The free energy (left) and the entropy (right) of the (6, 3) model, as functions of $1/\beta$, computed by exact
enumeration (symbols), and the corresponding theoretical predictions (continuous lines). The various
symbols refer to different system sizes: $N = 20$ (triangles), 30 (circles), 40 (stars) and 50 (filled
diamonds).



which is in good agreement with the exact result $\beta_c \approx 1.50794$.
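
The exact value of $\beta_c$ can be reproduced from the zero-entropy condition. The sketch below assumes the RCM paramagnetic free energy $f_P(\beta) = -(R/\beta)\log 2 - (1/\beta)\langle\log\cosh\beta h\rangle_h$ of Appendix B, whose entropy is $s(\beta) = R\log 2 + \langle \log\cosh\beta h - \beta h\tanh\beta h\rangle_h$, and finds its zero by bisection. The function names and the bracket `[lo, hi]` are hypothetical choices, and the two-peak weights ($1 - p$ on $+h_0$, $p$ on $-h_0$) are an assumption; only $|h| = h_0$ matters here.

```python
import math


def rcm_entropy(beta, R, fields, weights):
    """Paramagnetic entropy s = beta^2 d f_P / d beta of the random codeword model."""
    s = R * math.log(2.0)
    for h, w in zip(fields, weights):
        x = beta * h
        s += w * (math.log(math.cosh(x)) - x * math.tanh(x))
    return s


def freezing_beta(R, fields, weights, lo=1e-6, hi=50.0, tol=1e-12):
    """Bisection for the zero of the entropy, which decreases with beta."""
    if rcm_entropy(hi, R, fields, weights) > 0:
        raise ValueError("entropy stays positive on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rcm_entropy(mid, R, fields, weights) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


p = 0.2
h0 = math.atanh(1 - 2 * p)  # h0 = arctanh(1 - 2p)
beta_c = freezing_beta(0.5, [h0, -h0], [1 - p, p])
print(beta_c)
```

For $R = 1/2$ and $p = 0.2$ the root agrees with the value quoted above to the digits given.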

7.2 The (6, 3) model 

In this case we are forced to consider quite small systems since we do not know any simple form
for the probability distribution of the energy spectrum. We must enumerate all the codewords
(i.e. the spin configurations which satisfy the constraints in Eq. (2.5)): this takes at least
$O(2^{NR})$ operations. Notice that finding the codewords is a simple task. It suffices to solve
the linear system $Cx = 0 \pmod 2$. A standard method (we used Gaussian elimination) takes
$O(N^3)$ operations [33].
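
The enumeration step can be illustrated with a small sketch: Gaussian elimination over GF(2) gives a basis of the solutions of $Cx = 0 \pmod 2$, and the codewords are all binary combinations of the basis vectors. The function names are hypothetical and the code is written for clarity rather than speed. Mapping bits to spins, for instance through $\sigma_i = 1 - 2x_i$, is an assumed convention.

```python
import itertools
import numpy as np


def nullspace_gf2(C):
    """Return a basis (as rows) of {x : C x = 0 mod 2} by Gaussian elimination."""
    A = np.array(C, dtype=np.uint8) % 2
    M, N = A.shape
    pivots = []
    row = 0
    for col in range(N):
        if row == M:
            break
        hits = np.nonzero(A[row:, col])[0]
        if hits.size == 0:
            continue  # no pivot in this column: it is a free variable
        p = row + hits[0]
        A[[row, p]] = A[[p, row]]  # bring the pivot row up
        for r in range(M):
            if r != row and A[r, col]:
                A[r] ^= A[row]  # eliminate the column everywhere else
        pivots.append(col)
        row += 1
    free = [c for c in range(N) if c not in pivots]
    basis = np.zeros((len(free), N), dtype=np.uint8)
    for b, f in enumerate(free):
        basis[b, f] = 1
        for r, pc in enumerate(pivots):
            basis[b, pc] = A[r, f]  # pivot variable fixed by the free one
    return basis


def codewords(C):
    """Yield every solution of C x = 0 (mod 2), i.e. 2^(N - rank) codewords."""
    B = nullspace_gf2(C)
    for coeffs in itertools.product((0, 1), repeat=len(B)):
        yield (np.array(coeffs, dtype=np.uint8) @ B) % 2
```

The elimination is polynomial in $N$, while the loop in `codewords` is the exponential part, which is what limits the system sizes that can be reached.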



As in the previous Subsection, we considered the binary field distribution (2.6) with
$h_0 = \mathrm{arctanh}(1 - 2p)$, and $p = 0.2$. In Fig. 7 we plot the results for the free energy and the
entropy densities for systems of size $N = 20, 30, 40$ (averaged over $N_{\rm stat} = 1000$ samples) and
$N = 50$ (with $N_{\rm stat} = 20$ samples). The numerical results converge quite well to the theoretical
calculation at high temperature. Below the critical temperature the convergence is very slow,
as expected from the analogy with the RCM example.

The sizes considered here are too small to reach any definite conclusion on the glassy phase. 



8 Discussion 

The main result of this paper is the determination of the phase diagram of regular Gallager
codes, see Eq. (2.5). This is depicted in Fig. fj] for the infinite connectivity limit. The phase
diagram for finite connectivities has been obtained by resorting to the replica method and looks
qualitatively similar. The most important quantitative difference is the critical noise level for
the ferromagnetic-spin glass phase transition. This quantity determines the performance of
the corresponding code. It can be determined either by solving the mean field equations
numerically, see Sec. 5, or in a large connectivity expansion, see Sec. 6. The result of the last
computation is reported in Fig. ^.

The replica computation was made possible by the particularly simple one-step replica
symmetry breaking solution exhibited in Eq. (5.14). We were not able to prove that the saddle
point (5.14) is either unique or the dominant one. There are however several independent
indications which support this conclusion:

• The proposed solution is consistent with the absence of replica symmetry breaking on
the $\beta = 1$ line, which has been proved in Sec. 3.

• It has been shown [19, 34] that the critical noise level is the same both for zero-temperature
and for temperature one decoding. This implies that the ferromagnetic-spin glass phase
boundary must pass through the points $(p = p_c(k,l),\ 1/\beta = 0)$ and $(p = p_c(k,l),\ 1/\beta = 1)$,
see Fig. |] (for the sake of simplicity we referred to the case of a binary field distribution).
This is consistent with our phase diagram.

• Our numerical results, although restricted to fairly small systems, do not contradict
our conclusions.



It is interesting to notice that recently [35] a "factorized ansatz" has been proposed as an
exact one-step replica symmetry breaking solution for some diluted spin models. The solution
used in this paper is, in some sense, complementary to the one of Ref. [35].

A Codewords in the k, l → ∞ limit

In this Appendix we compute the one-codeword and two-codeword probabilities, see Eqs.
(4.1) and (4.4), for generic values of $k$ and $l$. Then we show that, in the $k, l \to \infty$ limit,
different codewords become statistically independent, i.e. $P_{\sigma,\tau} \approx P_\sigma P_\tau$.
The one-codeword probability is, to the leading exponential order:



P„~ jY[dX{a)dX(a) eKp{NAx(X,X;c)}, 



where 



A 1 (X,X;c) = -lJ2x{a)X(a) + — 

a 

+1^2 c(a) log X(a) +1 



i 

k 



(A.l) 



(A.2) 
and $c(\sigma) = (1/N)\sum_i \delta_{\sigma,\sigma_i}$ characterizes the configuration $\sigma$. The above result can be proved
by noticing that $\sum_\sigma P_\sigma \exp(\beta h_0 \sum_i \sigma_i) = \langle Z(h_0)\rangle_C$, where $Z(h_0)$ is the partition function for
the model (2.5) with uniform magnetic field $h_i = h_0$. The average $\langle Z(h_0)\rangle_C$ is easily obtained
from Eqs. (5.1) and (5.2) by setting $n = 1$ and $p_h(h_i) = \delta(h_i - h_0)$.



The integral (A.l) can be done through the saddle point method. Saddle point equations 
are more conveniently written by eliminating A(cr), and using the variables A+ = E CT X(cr) and 
A- = £ CT A((r)c7. We get: 



A^ + A^ = 2, (A.3) 
A-A^ + A+A^ 1 = 2m, (A.4) 

where m = E CT c(cr)a = (1/iV) Ej cij. For large k, these equations imply A + = 2 l / k + 0(m k ), 
\_ = 2 l l k m + 0(m k ), as soon as — 1 < m < 1. Substituting in Eq. ( |A.2| ), we get the result 
anticipated in Sec. £§ see Eqs. (p~2|), (p~3|). 

Let us now consider the two-codeword probability, cf . Eq. (fO|) • Analogously to Eq. (|A.l[) 
we get: 

Pa,r~ /"ndA(a,r)dA(a,r) exp{ATA 2 (A,A;c)}. (A.5) 

The corresponding "action" is 

A 2 (A,A;c) = -lJ2H<r,T)\(cr,T) + ~ E' E' A ( 

o"i,ti) . . . A(cr fe ,r fc ) + 

CT,T CTl-.-O-fc Tl...T fc 

+Zj^c(<r,r)logA(<r,r)+i-^, (A.6) 

cr,r 

where c(a,r) = (1/iV) Ei S a . jl7 5 Ti)T , and the sums E are restricted to o\ ■ ■ ■ = +1 and 
r\ ■ ■ ■ Tk = +1. As before we notice that JZ^P^r^PiPhi £V ai+(3h 2 Ei n) = (Z(hi)Z(h 2 ))c 
can be obtained through a standard replica calculation, see Sec. |5| and App. [C], with n = 2 
replicas. 

We now define the variables Ao = Y^ar / ^( CJ ' T )' = E ff T% T ) ,T ' = Eut^K 1 ") 1 ") & n d 
Ao- T = X^o-r A((T, t)o"t. The saddle point equations can be written in terms of these variables 
as follows: 

Ag + A*+A* + A* r = 4, (A.7) 
A^- 1 + AoA^ 1 + A^- 1 + A.A^ 1 = 4m CT , (A.8) 
A.A^ 1 + A^- 1 + AqA^ 1 + A^A^ 1 = 4m r , (A.9) 
A^- 1 + A.A^ 1 + A^- 1 + AqA^ 1 = 4q, (A.10) 

where m a = ^ CTjT c(a, t)o = (1/AOEi ^ m r = E ff ,r c (°"> T ) r = 0-/ N )Hi T i, and q = 
E ct , t c(ct,t)c7t = (1/JNOEi^- From E q s - (B-(B), we get. for k ~+ °°> A o - 4 1 /*, 
A CT ~ 4( 1_fc )/ fe m -, A r ~ 4( 1_fe )/ fc m r , A CTT ~ 4( 1 - fe )/ fc gr, as soon as —1 < m a ,m r ,q < 1. The 
corrections to this asymptotic behavior are of order 0(m k ,m k ,q k ). Substituting this solution 
in Eqs. flAg), (|Ag), we get the results (gj), fl4~6|) . 
Figure 8: The RCM for $p_h(h_i) = (2/5)\,\delta(h_i - 1/2) + (3/5)\,\delta(h_i - 1)$. The continuous line encircles
the region $\Omega$ (see text). The dashed line is the curve $m_1 = \tanh(\beta/2)$, $m_2 = \tanh\beta$, which intersects
the boundary of $\Omega$ for $\beta = \beta_c$.



B The random codeword model for a generic field 
distribution 

In this Appendix we solve¹ the RCM for a generic field distribution $p_h(h_i)$. The strategy is to
start from a discrete distribution

$$p_h(h_i) = \sum_{q=1}^{M} p_q\, \delta(h_i - h^{(q)})\,, \tag{B.1}$$

and then approximate a generic $p_h(h_i)$ by letting $M \to \infty$.

Let us consider the distribution (B.1). In the typical sample there will be $N_1 \approx N p_1$
sites with field $h_i = h^{(1)}$ (which we can suppose, without loss of generality, to be the sites
$i = 1, \dots, N_1$), $N_2 \approx N p_2$ sites with field $h_i = h^{(2)}$ (let us say for $i = N_1 + 1, \dots, N_1 + N_2$),
and so on. For a given spin configuration $\sigma$, we define the partial magnetization $m_q(\sigma)$ as the
magnetization of the sites whose magnetic field is $h^{(q)}$. With the labeling of the sites chosen
above we get

$$m_q(\sigma) \equiv \frac{1}{N_q} \sum_{i=M_{q-1}+1}^{M_q} \sigma_i\,, \tag{B.2}$$

where $M_q = N_1 + \dots + N_q$. We call $\{m_q(\sigma)\}$ the magnetization profile of the configuration $\sigma$.

We now consider the states $\alpha = 1, \dots, 2^{NR}$. To each of them is associated a
random codeword $\sigma^{(\alpha)}$, where the $\sigma_i^{(\alpha)}$ are quenched variables drawn with flat probability
distribution. We ask ourselves what is the typical number $\mathcal{N}_{\rm typ}(\{m_q\})$ of states $\alpha$ having a
given magnetization profile $m_q(\sigma^{(\alpha)}) = m_q$. The answer is quite easy. Define the function
$\mathcal{G}(\{m_q\})$ as follows

$$\mathcal{G}(\{m_q\}) = R\log 2 + \sum_{q=1}^{M} p_q\, \mathcal{H}(m_q)\,, \tag{B.3}$$

where $\mathcal{H}(x)$ is given in Eq. (4.8). The typical number $\mathcal{N}_{\rm typ}(\{m_q\})$ is obtained from $\mathcal{G}(\{m_q\})$
through the usual construction: $\mathcal{N}_{\rm typ}(\{m_q\}) \sim \exp[N\mathcal{G}(\{m_q\})]$ if $\mathcal{G}(\{m_q\}) \ge 0$, and $\mathcal{N}_{\rm typ}(\{m_q\}) = 0$
otherwise. The convex region $\Omega = \{\{m_q\}\,|\,\mathcal{G}(\{m_q\}) \ge 0\}$ is depicted in Fig. 8 for the case
$M = 2$.

¹ I am deeply indebted to B. Derrida, who explained to me how to treat this general case.

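The energy of a state $\alpha$ can be written in terms of its magnetization profile:
$E^{(\alpha)} = -N\sum_q p_q h^{(q)} m_q(\sigma^{(\alpha)})$. The free energy density can therefore be computed from
$\mathcal{N}_{\rm typ}(\{m_q\})$ as follows:

$$f(\beta) = \min_{\{m_q\}} \left\{ -\frac{1}{\beta}\, G(\{m_q\}) - \sum_{q=1}^{M} p_q h^{(q)} m_q \right\}, \tag{B.4}$$

where $G(\{m_q\}) = (1/N)\log\mathcal{N}_{\rm typ}(\{m_q\})$ (i.e. $G(\{m_q\}) = \mathcal{G}(\{m_q\})$ inside $\Omega$, and $G(\{m_q\}) = -\infty$ outside).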

If the expression (B.3) is used in Eq. (B.4), one gets the saddle point condition
$m_q = \tanh\beta h^{(q)}$. This describes a curve in the $\{m_q\}$ space which starts at $m_q = 0$ for $\beta = 0$, and ends
at $m_q = \mathrm{sign}\, h^{(q)}$ for $\beta = \infty$. The corresponding free energy reads

$$f_P(\beta) = -\frac{R}{\beta}\log 2 - \frac{1}{\beta} \sum_{q=1}^{M} p_q \log\cosh\beta h^{(q)}\,. \tag{B.5}$$

At some critical temperature $\beta = \beta_c$ the curve $m_q = \tanh\beta h^{(q)}$ crosses the boundary of $\Omega$. The
saddle point $m_q = \tanh\beta h^{(q)}$ is no longer valid for $\beta > \beta_c$. The critical temperature can be
computed from the zero entropy condition $\partial_\beta f_P|_{\beta=\beta_c} = 0$. For $\beta > \beta_c$ the entropy vanishes
and the free energy is frozen to its value at the critical point: $f_{SG}(\beta) = f_P(\beta_c)$. As in Sec. |||,
we must include in our analysis the ordered state $\alpha = 0$ whose free energy is $f_F(\beta) = -\langle h\rangle_h$.
The solution for a continuous field distribution $p_h(h_i)$ follows from the above results by
taking the $M \to \infty$ limit in Eq. (B.5). This yields Eq. (4.15). Alternatively we could
have started with a continuous magnetization profile $m(h)$ from the very beginning of this
Appendix.



C The derivation of Eq. flggj ) 

We start by writing down the partition function of the model (2.5):

M 

Z(fi) = Y, \{5[a^ ,+l]e^^ . (C.l) 

3. J=l 

We rewrite the constraint term (i.e. the product of Kronecker delta functions) by introducing
the quenched variables $D_\omega = 0, 1$, where $\omega = (i_1^\omega, \dots, i_k^\omega)$ runs over the $k$-plets of site indices.
The variables $D_\omega$ are defined by setting $D_\omega = 1$ if $\omega = \omega_j$ for some $j = 1, \dots, M$ and $D_\omega = 0$
otherwise. With this definition we can write the replicated partition function as follows



1 N 

= jf E E II ( e ^ a V h III 1 - v» + ^»[^]} > (c.2) 

{D} {ct} i=l w 

where ^ = (Ilr=l ■ ■ ■ ' Ilr=l a $0> ^Pl = ria=i ^I " ' +1]) and A/" is a normalization con- 
stant (to be computed later). 

According to our choice of the ensemble of check matrices, we must impose Ylusi = h 
for any i = 1, . . . , N. This can be done by using the identity 



dz i 1 X»zi D » 
2vri J+ 1 { 



(C.3) 



where the integration path encircles the origin in the complex Zj plane. We get 
i N r a 1 1 

{ct} i=l 7 i u D^=0 



(C.4) 



where z w = Jlie^ The weights w(D u ) have been introduced for later convenience, and cor- 
respond to a rescaling of the {z{\. Their contribution can be readsorbed by the normalization 
constant A/ 7 . We set w(l) = l(k — 1)\/N k ~ 1 and w(0) = 1 — w(l). Now we can sum over the 
Du, obtaining 



(C.5) 



{*} i=i 



• exp < 



Nl 



T £ c z (a 1 )...c z (a k )l[S[^...a a k ,+l}\ , 



ai,...,a k 



0=1 



where c z (cx) = (1/iV) ^ ZiSff,ff v Finally we introduce the order parameter A(c?) and its complex 
conjugate A(<r), by using the following identity 

exp{NF[c]} = ^dA(a)dA(a) exp J -Nl ^ A(a)A(a)+ (C.6) 

8 \ 8 

The use of the above identity allows to integrate over the {zi}, obtaining Eqs. (5.1) and (|5.2|). 
The overall normalization constant can be fixed by requiring (Z n ) ~ 2 Afn ( 1 ~ i / fc ) for /ij = 0. 



D Large k, l expansion: general formulae

Let us define $t_p = \langle(\tanh\beta h)^p\rangle_h$. We assume formally $t_p = O(t^p)$ where $t$ is "small" and expand
in $t^k$ to the order $t^{3k}$. All the observables can be expressed in terms of the order parameters
$\pi(x)$ and $\hat\pi(y)$. The solutions of Eqs. (5.12), (5.13) admit an expansion of the form

oo oo 

Tr(x)=p h (x)+Y,*mf3- m p { ™ ) (x) ; n(y) = 6(y) + J2*nf3- n S ( - n Hy), (D.l) 



m=l 



n=l 



where (x) = d™ph(x) and 5^ n \y) = dyS(y). Moreover one gets vr m ,7f m = 0{t mk ). The 
results for the first few coefficients are listed below: 



TTi = -(I-l)t? 



7T 2 
7T3 



fc-1 



,2fc-3 



(fc-l)(;-l) 2 (l-< 2 )^ 

1(1 - l)^" 1 - i(fc - l)(fc - 2)(i - 1) 3 (1 - t 2 ftf - 5 - (k - lf(l - 1) 3 (1 - t 2 ftf - 5 



n 2 



7T3 = 



3 V ' * 2 

+ (k - - 1) 2 (£! - h)t k 2 -h\- 2 + (k— - 1) 2 (Z - 2)(tx - t 3 )tf" 4 
i(/-l)^- 1 + i(*-l)G-2)<f- 2 + 

,fc-2,fc-l i / 1 -i \ / / i\2/; r>\/*i + \+3fc-4 



0(i 4fc ) 



+ (fc - - l)^i - t 3 )t K 2 - + (fc - 1)(Z - 1) 2 (Z - 2)(1 - t a )t?* - 4 + 0(i 4fc ) , 

-i(/-i)t*-i-i(i-i)(i-2)^- 1 t*- 1 ' 



A 1 - -(/- l)(Z-2)(Z-3)if- 3 + 0(t 4fe ), 



./c-l 



(k-m-i)(i-t 2 )t( 



2fc-3 



(D.2) 
+ 

(D.3) 

(D.4) 
(D.5) 



1 



(fc-l)(fc-2)(Z-l) 2 (l-i 2 )^ 



2.3fc-5 



{k-iy(i~iy{i~t 2 ytf~ b + 

i 



+(k - 1)(Z - l)(*i - t 3 )tr *i + (fc - - - 2)(t| - i 3 )*! 

1 



- 4 - -tj- 1 + 0(t ik ) , 

3 



tip 1 + (k— 1)(Z - !)(*! - ta)^-^- 1 + 0(t 4fc ) 



fc-2 .fe-1 



.fc-1 

V 3 



0(£ 



4 A- \ 



(D.6) 
(D.7) 



The result for the paramagnetic free energy is 

(3f P (0) = -i?lo g 2-(logcosh / 9/ l ) h -^-^(/-l)(l-t 2 )tf- 2 + i^fe_ 

~(k - - i) 2 (i - t 2 ) 2 4 k - 4 + -m- 2)(h - t z )tf - 3 + 

+Z(Z - - hrf-Ht 1 l -^tl + 0(t ik ) . 



(D.8) 



E Finite size corrections for the random codeword 
model 

Let us consider the binary field distribution with ho = 1. The results for a generic 

value of ho are obtained after a trivial rescaling of energies and temperatures: f((3,ho;N) = 
h f(pho,l;N). 

As explained in Sec. ^, the finite size corrections at the paramagnetic-spin glass phase 
transition can be studied by neglecting the ordered state. This introduces exponentially small 



errors. The calculation of the free energy can be done along the lines of Ref. |21], Appendix 
B, which starts from the identity: 

(logZ}= I™ j (e^ - e~ iZ ) . (E.l) 



" 
We limit ourselves to quoting the outcome of the calculation. For $\beta < \beta_c$, we get
$f(\beta,N) = f_P(\beta) + O(e^{-\kappa N})$. For $\beta > \beta_c$ we get Eq. (7.1), with



fo(/3) = -e(R) , /i(/3,iV) 



(E.2) 



7 ~ 0.577216 being the Euler constant. The function p{4>) is defined as the (unique) solution 
of 



(3 c p + log *(-JVe + p) = logfa) + ^ log [|iV(l - ? 2 ; 



(E.3) 



where — e(R) is the ground state energy density in the thermodynamic limit, see Sec. |||. The 
function Vt'(x) is defined as follows 



+oo 



-/3 c (2g+x) 



1 — exp ( — e 



= /3(2g+x) 



(E.4) 



g=— oo 



Notice that ^S>(x + 2) = ^(x). The log term in Eq. ( |E.3| ) gives therefore an oscillating 
iV dependence to fi((3,N). Moreover, since ^(—Nt+ p) remains finite for any N and p, 
fx(/3,N) ~ (l/2/3 c ) logiV as — > oo. Finally we remark that the sum in Eq. (|E.4| ) diverges 
as /3 I (3 C . This gives the singularity of the free energy corrections at the critical point: 

/i(&j\0~(i/&)iog(i-/y/3). 